Overview

Dataset statistics

Number of variables: 13
Number of observations: 7050
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 53
Duplicate rows (%): 0.8%
Total size in memory: 716.1 KiB
Average record size in memory: 104.0 B

Variable types

Categorical: 4
Numeric: 9

Warnings

Dataset has 53 (0.8%) duplicate rows Duplicates
status_id has a high cardinality: 6997 distinct values High cardinality
num_reactions is highly correlated with num_likes High correlation
num_likes is highly correlated with num_reactions High correlation
num_hahas is highly skewed (γ1 = 20.30574123) Skewed
status_id is uniformly distributed Uniform
num_reactions has 121 (1.7%) zeros Zeros
num_comments has 2119 (30.1%) zeros Zeros
num_shares has 3911 (55.5%) zeros Zeros
num_likes has 126 (1.8%) zeros Zeros
num_loves has 4230 (60.0%) zeros Zeros
num_wows has 5308 (75.3%) zeros Zeros
num_hahas has 5916 (83.9%) zeros Zeros
num_sads has 6443 (91.4%) zeros Zeros
num_angrys has 6627 (94.0%) zeros Zeros
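
The Skewed and Zeros warnings can be checked by hand. The sketch below, in plain Python, computes the moment-based skewness g1 = m3 / m2^1.5 and the share of zeros for a list of counts. The sample data and function names are invented for illustration. The report's γ1 presumably comes from pandas, whose `skew` applies a bias adjustment, so results on small samples can differ slightly from this unadjusted version.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, without bias correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Hypothetical reaction counts: mostly zeros with one large value.
counts = [0, 0, 0, 0, 0, 0, 0, 1, 2, 17]
print(f"zeros: {zero_share(counts):.1%}, skewness: {skewness(counts):.2f}")
```

A long right tail, as in the num_hahas column, drives the third moment up and yields a large positive skewness.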

Reproduction

Analysis started: 2021-05-02 10:45:36.047497
Analysis finished: 2021-05-02 10:45:45.627808
Duration: 9.58 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

status_id
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 6997
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Memory size: 55.2 KiB

819700534875473_998824716963053: 2
819700534875473_957697307742461: 2
819700534875473_1000607730118085: 2
819700534875473_976401089205416: 2
819700534875473_963754250470100: 2
Other values (6992): 7040

Length

Max length: 33
Median length: 31
Mean length: 31.31531915
Min length: 31

Characters and Unicode

Total characters: 220773
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6944
Unique (%): 98.5%
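
"Distinct" and "Unique" measure different things: distinct counts the different values, unique counts the values that occur exactly once. The figures for status_id fit 53 identifiers that each occur twice: 6944 + 53 = 6997 distinct values and 6944 + 2 × 53 = 7050 rows. A minimal sketch of the two counts, using an invented helper name and toy data:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy identifiers: "b" and "c" are repeated, "a" and "d" occur once.
ids = ["a", "b", "b", "c", "c", "d"]
print(distinct_and_unique(ids))
```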

Sample

1st row: 246675545449582_1649696485147474
2nd row: 246675545449582_1649426988507757
3rd row: 246675545449582_1648730588577397
4th row: 246675545449582_1648576705259452
5th row: 246675545449582_1645700502213739
Value                             Count  Frequency (%)
819700534875473_998824716963053   2      < 0.1%
819700534875473_957697307742461   2      < 0.1%
819700534875473_1000607730118085  2      < 0.1%
819700534875473_976401089205416   2      < 0.1%
819700534875473_963754250470100   2      < 0.1%
819700534875473_993975437447981   2      < 0.1%
819700534875473_957599447752247   2      < 0.1%
819700534875473_968264653352393   2      < 0.1%
819700534875473_999880033524188   2      < 0.1%
819700534875473_965407870304738   2      < 0.1%
Other values (6987)               7030   99.7%
[Figure: Histogram of lengths of the category]

Most occurring characters

Value  Count  Frequency (%)
5      33235  15.1%
4      29099  13.2%
8      23717  10.7%
1      23706  10.7%
6      23509  10.6%
2      18242  8.3%
7      18191  8.2%
3      15301  6.9%
0      14681  6.6%
9      14042  6.4%
6.4%

Most occurring categories

Value                  Count   Frequency (%)
Decimal Number         213723  96.8%
Connector Punctuation  7050    3.2%

Most frequent character per category

Value  Count  Frequency (%)
5      33235  15.6%
4      29099  13.6%
8      23717  11.1%
1      23706  11.1%
6      23509  11.0%
2      18242  8.5%
7      18191  8.5%
3      15301  7.2%
0      14681  6.9%
9      14042  6.6%

Value  Count  Frequency (%)
_      7050   100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  220773  100.0%

Most frequent character per script

Value  Count  Frequency (%)
5      33235  15.1%
4      29099  13.2%
8      23717  10.7%
1      23706  10.7%
6      23509  10.6%
2      18242  8.3%
7      18191  8.2%
3      15301  6.9%
0      14681  6.6%
9      14042  6.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  220773  100.0%

Most frequent character per block

Value  Count  Frequency (%)
5      33235  15.1%
4      29099  13.2%
8      23717  10.7%
1      23706  10.7%
6      23509  10.6%
2      18242  8.3%
7      18191  8.2%
3      15301  6.9%
0      14681  6.6%
9      14042  6.4%

num_reactions
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 1067
Distinct (%): 15.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 230.1171631
Minimum: 0
Maximum: 4710
Zeros: 121
Zeros (%): 1.7%
Memory size: 55.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 17
median: 59.5
Q3: 219
95-th percentile: 1239.65
Maximum: 4710
Range: 4710
Interquartile range (IQR): 202
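
Fractional quantiles such as a median of 59.5 or a 95-th percentile of 1239.65 on integer counts suggest linear interpolation between neighbouring order statistics, which is the default in NumPy and pandas. That the report uses this default is an assumption. A minimal plain-Python sketch of that interpolation, with an invented helper name and toy data:

```python
def percentile(values, q):
    """Percentile q (0..100) using linear interpolation between order statistics."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100   # fractional index into the sorted data
    lo = int(pos)                     # index of the lower neighbour
    hi = min(lo + 1, len(data) - 1)   # index of the upper neighbour
    frac = pos - lo                   # how far between the two neighbours
    return data[lo] + (data[hi] - data[lo]) * frac


sample = [1, 2, 3, 4]
q1, med, q3 = (percentile(sample, q) for q in (25, 50, 75))
print(q1, med, q3, "IQR:", q3 - q1)
```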

Descriptive statistics

Standard deviation: 462.6253091
Coefficient of variation (CV): 2.010390285
Kurtosis: 16.73644703
Mean: 230.1171631
Median Absolute Deviation (MAD): 52.5
Skewness: 3.738452153
Sum: 1622326
Variance: 214022.1767
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
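
Several of the num_reactions statistics are simple functions of one another, which gives a quick consistency check. The snippet below copies the reported values and recomputes the mean, coefficient of variation, variance, IQR and range. The variable names are chosen for illustration only.

```python
# Values copied from the num_reactions summary.
n = 7050             # number of observations
total = 1622326      # Sum
std = 462.6253091    # Standard deviation

mean = total / n           # Mean = Sum / number of observations
cv = std / mean            # Coefficient of variation = std / mean
variance = std ** 2        # Variance = std squared
iqr = 219 - 17             # IQR = Q3 - Q1
value_range = 4710 - 0     # Range = Maximum - Minimum

print(round(mean, 7), round(cv, 6), round(variance, 2), iqr, value_range)
```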
Value                Count  Frequency (%)
1                    131    1.9%
2                    124    1.8%
0                    121    1.7%
14                   121    1.7%
3                    116    1.6%
12                   112    1.6%
18                   111    1.6%
11                   106    1.5%
10                   101    1.4%
19                   97     1.4%
Other values (1057)  5910   83.8%

Value  Count  Frequency (%)
0      121    1.7%
1      131    1.9%
2      124    1.8%
3      116    1.6%
4      78     1.1%

Value  Count  Frequency (%)
4710   1      < 0.1%
4410   1      < 0.1%
4315   2      < 0.1%
4114   2      < 0.1%
4094   1      < 0.1%

num_comments
Real number (ℝ≥0)

ZEROS

Distinct: 993
Distinct (%): 14.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 224.3560284
Minimum: 0
Maximum: 20990
Zeros: 2119
Zeros (%): 30.1%
Memory size: 55.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 4
Q3: 23
95-th percentile: 1210.65
Maximum: 20990
Range: 20990
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 889.6368195
Coefficient of variation (CV): 3.965290463
Kurtosis: 126.8628701
Mean: 224.3560284
Median Absolute Deviation (MAD): 4
Skewness: 9.028850488
Sum: 1581710
Variance: 791453.6706
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value               Count  Frequency (%)
0                   2119   30.1%
1                   564    8.0%
2                   364    5.2%
3                   309    4.4%
4                   249    3.5%
5                   213    3.0%
6                   188    2.7%
7                   154    2.2%
8                   136    1.9%
9                   133    1.9%
Other values (983)  2621   37.2%

Value  Count  Frequency (%)
0      2119   30.1%
1      564    8.0%
2      364    5.2%
3      309    4.4%
4      249    3.5%

Value  Count  Frequency (%)
20990  1      < 0.1%
19013  1      < 0.1%
17404  1      < 0.1%
12003  1      < 0.1%
10960  1      < 0.1%

num_shares
Real number (ℝ≥0)

ZEROS

Distinct: 501
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.02255319
Minimum: 0
Maximum: 3424
Zeros: 3911
Zeros (%): 55.5%
Memory size: 55.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 4
95-th percentile: 260.1
Maximum: 3424
Range: 3424
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 131.5999655
Coefficient of variation (CV): 3.288145183
Kurtosis: 96.8629404
Mean: 40.02255319
Median Absolute Deviation (MAD): 0
Skewness: 7.099332142
Sum: 282159
Variance: 17318.55092
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value               Count  Frequency (%)
0                   3911   55.5%
1                   820    11.6%
2                   320    4.5%
3                   171    2.4%
4                   113    1.6%
5                   90     1.3%
6                   74     1.0%
7                   54     0.8%
8                   35     0.5%
9                   35     0.5%
Other values (491)  1427   20.2%

Value  Count  Frequency (%)
0      3911   55.5%
1      820    11.6%
2      320    4.5%
3      171    2.4%
4      113    1.6%

Value  Count  Frequency (%)
3424   1      < 0.1%
2139   1      < 0.1%
1636   1      < 0.1%
1618   1      < 0.1%
1430   1      < 0.1%

num_likes
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 1044
Distinct (%): 14.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 215.0431206
Minimum: 0
Maximum: 4710
Zeros: 126
Zeros (%): 1.8%
Memory size: 55.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 17
median: 58
Q3: 184.75
95-th percentile: 1160.1
Maximum: 4710
Range: 4710
Interquartile range (IQR): 167.75

Descriptive statistics

Standard deviation: 449.4723571
Coefficient of variation (CV): 2.0901499
Kurtosis: 18.42703221
Mean: 215.0431206
Median Absolute Deviation (MAD): 50
Skewness: 3.91912765
Sum: 1516054
Variance: 202025.3998
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value                Count  Frequency (%)
1                    128    1.8%
2                    127    1.8%
0                    126    1.8%
14                   124    1.8%
12                   120    1.7%
3                    118    1.7%
10                   110    1.6%
18                   106    1.5%
19                   97     1.4%
11                   97     1.4%
Other values (1034)  5897   83.6%

Value  Count  Frequency (%)
0      126    1.8%
1      128    1.8%
2      127    1.8%
3      118    1.7%
4      76     1.1%

Value  Count  Frequency (%)
4710   1      < 0.1%
4315   1      < 0.1%
4241   2      < 0.1%
4094   1      < 0.1%
4054   2      < 0.1%

num_loves
Real number (ℝ≥0)

ZEROS

Distinct: 229
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.72865248
Minimum: 0
Maximum: 657
Zeros: 4230
Zeros (%): 60.0%
Memory size: 55.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 3
95-th percentile: 77
Maximum: 657
Range: 657
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 39.97293011
Coefficient of variation (CV): 3.140389775
Kurtosis: 50.57163221
Mean: 12.72865248
Median Absolute Deviation (MAD): 0
Skewness: 6.004845077
Sum: 89737
Variance: 1597.835141
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value               Count  Frequency (%)
0                   4230   60.0%
1                   611    8.7%
2                   282    4.0%
3                   213    3.0%
4                   131    1.9%
5                   108    1.5%
6                   80     1.1%
7                   72     1.0%
8                   47     0.7%
9                   43     0.6%
Other values (219)  1233   17.5%

Value  Count  Frequency (%)
0      4230   60.0%
1      611    8.7%
2      282    4.0%
3      213    3.0%
4      131    1.9%

Value  Count  Frequency (%)
657    1      < 0.1%
529    1      < 0.1%
504    1      < 0.1%
485    1      < 0.1%
482    2      < 0.1%

num_wows
Real number (ℝ≥0)

ZEROS

Distinct: 65
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.289361702
Minimum: 0
Maximum: 278
Zeros: 5308
Zeros (%): 75.3%
Memory size: 55.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 4
Maximum: 278
Range: 278
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 8.71965038
Coefficient of variation (CV): 6.762765147
Kurtosis: 415.5921273
Mean: 1.289361702
Median Absolute Deviation (MAD): 0
Skewness: 18.24681302
Sum: 9090
Variance: 76.03230276
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value              Count  Frequency (%)
0                  5308   75.3%
1                  704    10.0%
2                  347    4.9%
3                  216    3.1%
4                  137    1.9%
5                  77     1.1%
6                  55     0.8%
7                  29     0.4%
8                  28     0.4%
9                  19     0.3%
Other values (55)  130    1.8%

Value  Count  Frequency (%)
0      5308   75.3%
1      704    10.0%
2      347    4.9%
3      216    3.1%
4      137    1.9%

Value  Count  Frequency (%)
278    1      < 0.1%
252    1      < 0.1%
206    1      < 0.1%
200    1      < 0.1%
177    1      < 0.1%

num_hahas
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 42
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6964539007
Minimum: 0
Maximum: 157
Zeros: 5916
Zeros (%): 83.9%
Memory size: 55.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 4
Maximum: 157
Range: 157
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3.957183443
Coefficient of variation (CV): 5.681902907
Kurtosis: 587.181095
Mean: 0.6964539007
Median Absolute Deviation (MAD): 0
Skewness: 20.30574123
Sum: 4910
Variance: 15.6593008
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=42)]
Value              Count  Frequency (%)
0                  5916   83.9%
1                  399    5.7%
2                  228    3.2%
3                  149    2.1%
4                  100    1.4%
5                  64     0.9%
6                  34     0.5%
8                  24     0.3%
7                  20     0.3%
9                  17     0.2%
Other values (32)  99     1.4%

Value  Count  Frequency (%)
0      5916   83.9%
1      399    5.7%
2      228    3.2%
3      149    2.1%
4      100    1.4%

Value  Count  Frequency (%)
157    1      < 0.1%
102    1      < 0.1%
100    1      < 0.1%
97     1      < 0.1%
91     1      < 0.1%

num_sads
Real number (ℝ≥0)

ZEROS

Distinct: 24
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2436879433
Minimum: 0
Maximum: 51
Zeros: 6443
Zeros (%): 91.4%
Memory size: 55.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 51
Range: 51
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.59715594
Coefficient of variation (CV): 6.554103244
Kurtosis: 427.0720932
Mean: 0.2436879433
Median Absolute Deviation (MAD): 0
Skewness: 17.57886772
Sum: 1718
Variance: 2.550907095
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=24)]
Value              Count  Frequency (%)
0                  6443   91.4%
1                  321    4.6%
2                  113    1.6%
3                  64     0.9%
4                  37     0.5%
5                  14     0.2%
6                  12     0.2%
8                  12     0.2%
10                 6      0.1%
7                  6      0.1%
Other values (14)  22     0.3%

Value  Count  Frequency (%)
0      6443   91.4%
1      321    4.6%
2      113    1.6%
3      64     0.9%
4      37     0.5%

Value  Count  Frequency (%)
51     1      < 0.1%
46     2      < 0.1%
37     1      < 0.1%
28     1      < 0.1%
23     2      < 0.1%

num_angrys
Real number (ℝ≥0)

ZEROS

Distinct: 14
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1131914894
Minimum: 0
Maximum: 31
Zeros: 6627
Zeros (%): 94.0%
Memory size: 55.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 31
Range: 31
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7268118907
Coefficient of variation (CV): 6.421082493
Kurtosis: 624.7529455
Mean: 0.1131914894
Median Absolute Deviation (MAD): 0
Skewness: 19.50712917
Sum: 798
Variance: 0.5282555244
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=14)]
Value             Count  Frequency (%)
0                 6627   94.0%
1                 276    3.9%
2                 71     1.0%
3                 35     0.5%
4                 17     0.2%
5                 9      0.1%
6                 4      0.1%
8                 3      < 0.1%
7                 2      < 0.1%
19                2      < 0.1%
Other values (4)  4      0.1%

Value  Count  Frequency (%)
0      6627   94.0%
1      276    3.9%
2      71     1.0%
3      35     0.5%
4      17     0.2%

Value  Count  Frequency (%)
31     1      < 0.1%
19     2      < 0.1%
12     1      < 0.1%
10     1      < 0.1%
9      1      < 0.1%

status_link
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.2 KiB

0: 6987
1: 63

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 7050
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0
Value  Count  Frequency (%)
0      6987   99.1%
1      63     0.9%
[Figure: Histogram of lengths of the category]

Most occurring characters

Value  Count  Frequency (%)
0      6987   99.1%
1      63     0.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  7050   100.0%

Most frequent character per category

Value  Count  Frequency (%)
0      6987   99.1%
1      63     0.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  7050   100.0%

Most frequent character per script

Value  Count  Frequency (%)
0      6987   99.1%
1      63     0.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  7050   100.0%

Most frequent character per block

Value  Count  Frequency (%)
0      6987   99.1%
1      63     0.9%

status_photo
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.2 KiB

1: 4288
0: 2762

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 7050
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1
Value  Count  Frequency (%)
1      4288   60.8%
0      2762   39.2%
[Figure: Histogram of lengths of the category]

Most occurring characters

Value  Count  Frequency (%)
1      4288   60.8%
0      2762   39.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  7050   100.0%

Most frequent character per category

Value  Count  Frequency (%)
1      4288   60.8%
0      2762   39.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  7050   100.0%

Most frequent character per script

Value  Count  Frequency (%)
1      4288   60.8%
0      2762   39.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  7050   100.0%

Most frequent character per block

Value  Count  Frequency (%)
1      4288   60.8%
0      2762   39.2%

status_status
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.2 KiB

0: 6685
1: 365

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 7050
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0
Value  Count  Frequency (%)
0      6685   94.8%
1      365    5.2%
[Figure: Histogram of lengths of the category]

Most occurring characters

Value  Count  Frequency (%)
0      6685   94.8%
1      365    5.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  7050   100.0%

Most frequent character per category

Value  Count  Frequency (%)
0      6685   94.8%
1      365    5.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  7050   100.0%

Most frequent character per script

Value  Count  Frequency (%)
0      6685   94.8%
1      365    5.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  7050   100.0%

Most frequent character per block

Value  Count  Frequency (%)
0      6685   94.8%
1      365    5.2%

Interactions

[Figures: 72 interaction plots rendered with Matplotlib v3.3.4]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
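
The calculation can be sketched in a few lines of plain Python. The function name and toy data are illustrative only. Population moments (division by n) are used here, and the same ratio results with sample moments because the factors cancel.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Toy data: y is an exact linear function of x, so r is 1 up to rounding.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
print(pearson_r(x, y))
```

Rescaling or shifting either variable (for example `[10 * a + 3 for a in x]`) leaves the result unchanged, which is the invariance mentioned in the description.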

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
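
A minimal plain-Python sketch of this recipe: replace each value by its rank, then apply the Pearson formula to the ranks. Giving tied values their average rank is a common convention and an assumption here. All names and data are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# A monotonic but nonlinear relation: rho is 1 up to rounding.
print(spearman_rho([1, 2, 3, 4], [1, 8, 27, 64]))
```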

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
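
The pair-counting definition translates directly into code. The sketch below implements exactly the formula as stated, often called tau-a. Statistical libraries commonly report tau-b instead, which adds a correction for ties, so results can differ when the data contain ties. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
    return (concordant - discordant) / len(pairs)


# Two of the three pairs are concordant and one is discordant.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```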

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
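
As a rough illustration, the bias-corrected estimator can be written from a contingency table of counts. The sketch follows the commonly cited form of Bergsma's correction and is not taken from the report's own code. It assumes every row and column total is positive, and the function name and tables are invented.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


independent = [[10, 10], [10, 10]]   # no association
perfect = [[20, 0], [0, 20]]         # each row determines the column
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```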

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

status_id  num_reactions  num_comments  num_shares  num_likes  num_loves  num_wows  num_hahas  num_sads  num_angrys  status_link  status_photo  status_status
0246675545449582_1649696485147474529512262432923110000
1246675545449582_16494269885077571500015000000010
2246675545449582_164873058857739722723657204211100000
3246675545449582_16485767052594521110011100000010
4246675545449582_16457005022137392130020490000010
5246675545449582_16456501622187732176021151000010
6246675545449582_1645564175560705503614724187010203000
7246675545449582_164482466563465629545353260321101000
8246675545449582_16446557956515432031019850000010
9246675545449582_16387883795716181709116730000010

Last rows

status_id  num_reactions  num_comments  num_shares  num_likes  num_loves  num_wows  num_hahas  num_sads  num_angrys  status_link  status_photo  status_status
70401050855161656896_10630710504353079326349030000010
70411050855161656896_1062020473873698900720000010
70421050855161656896_1061944223881323400400000010
70431050855161656896_10619181838839271962319510000010
70441050855161656896_106190662055175086008600000010
70451050855161656896_106186347055606589008900000010
70461050855161656896_106133475727560316001410100010
70471050855161656896_1060126464063099200110000010
70481050855161656896_1058663487542730351122234920000010
70491050855161656896_105085884165652817001700000010

Duplicate rows

Most frequent

status_id  num_reactions  num_comments  num_shares  num_likes  num_loves  num_wows  num_hahas  num_sads  num_angrys  status_link  status_photo  status_status  count
0246675545449582_32688345076212421120211000000102
1246675545449582_429583263825475537161537000000102
2819700534875473_1000607730118085170421316851522000102
3819700534875473_100198251998060625574249600000102
4819700534875473_10023727332749183762033541930000002
5819700534875473_95161460501739898571429625162000102
6819700534875473_95304822154070319853921196111120100102
7819700534875473_9543871514068101861511723110000102
8819700534875473_95514910133061511461108330000102
9819700534875473_95574312460454687916518867480000102